A risk minimization framework for information retrieval
Authors
Abstract
This paper presents a novel probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models (i.e., probabilistic models of text), user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate the systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive new retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they go beyond independent topical relevance.
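As a rough illustration of retrieval cast as risk minimization, the sketch below ranks documents by ascending loss. It rests on several assumptions that are not taken from the paper: point-estimate unigram language models with additive smoothing, and KL divergence between the query model and each document model as the loss. The names `unigram_model`, `kl_divergence`, and `rank_by_risk`, the smoothing constant, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, mu=1.0):
    """Additively smoothed unigram language model over a fixed vocabulary.

    `mu` is an illustrative smoothing constant (an assumption, not a value
    from the paper).
    """
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def rank_by_risk(query_tokens, docs):
    """Return document ids sorted by ascending loss (lowest risk first).

    Assumed loss: KL divergence from the query model to the document model,
    using point estimates rather than a full posterior over models.
    """
    vocab = set(query_tokens).union(*(set(t) for t in docs.values()))
    q_model = unigram_model(query_tokens, vocab)
    risks = {
        doc_id: kl_divergence(q_model, unigram_model(tokens, vocab))
        for doc_id, tokens in docs.items()
    }
    return sorted(risks, key=risks.get)


# Toy collection (hypothetical data for illustration only).
docs = {
    "d1": ["risk", "minimization", "retrieval"],
    "d2": ["speech", "dialogue", "system"],
}
print(rank_by_risk(["risk", "retrieval"], docs))  # ['d1', 'd2']
```

Swapping in a different loss function changes the ranking criterion without touching the language models, which is the kind of modularity the abstract attributes to the framework. A loss that depends on previously ranked documents, as subtopic retrieval would need, is outside the scope of this sketch.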
Similar resources
Risk Minimization and Language Modeling in Text Retrieval – Thesis Summary
This thesis presents a new general probabilistic framework for text retrieval based on Bayesian decision theory. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. This risk minimization framework not only unifies several existing retrieval models withi...
Bayes risk-based optimization of dialogue management for document retrieval system with speech interface
We propose an efficient dialogue management for an information navigation system based on a document knowledge base. It is expected that incorporation of appropriate N-best candidates of ASR and contextual information will improve the system performance. The system also has several choices in generating responses or confirmations. In this paper, this selection is optimized as minimization of Ba...
Performance Evaluation of Medical Image Retrieval Systems Based on a Systematic Review of the Current Literature
Background and Aim: Images, as information vehicles that can convey a large volume of information, are important, especially in the field of medicine. The existence of different attributes of image features and various search algorithms in medical image retrieval systems, and the lack of an authority to evaluate the quality of retrieval systems, make a systematic review in medical image retrieval system...
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. In the proposed method, a framework using relevance feedback is presented. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
Regression Models for Ordinal Data: A Machine Learning
In contrast to the standard machine learning tasks of classification and metric regression, we investigate the problem of predicting variables of ordinal scale, a setting referred to as ordinal regression. The task of ordinal regression arises frequently in the social sciences and in information retrieval, where human preferences play a major role. Also many multi-class problems are really problem...
Journal: Inf. Process. Manage.
Volume: 42
Publication date: 2006